[SPARK-10509][PYSPARK] Reduce excessive param boiler plate code#10216
[SPARK-10509][PYSPARK] Reduce excessive param boiler plate code#10216holdenk wants to merge 17 commits intoapache:masterfrom
Conversation
|
Test build #47408 has finished for PR 10216 at commit
|
|
cc @JoshRosen & @yanboliang who have probably had to deal with a fair amount of this boilerplate code in the past. |
|
ping @yanboliang if you have a chance to look at this would be appreciated. |
|
Test build #48492 has finished for PR 10216 at commit
|
|
re-ping @yanboliang if you have any thoughts on this approach. |
|
cc @jkbradley |
python/pyspark/ml/classification.py
Outdated
There was a problem hiding this comment.
LogisticRegression.threshold is a placeholder to make the param appear in the generated doc. After this change, it has more uses, should we also change the annotation of that dummy param?
There was a problem hiding this comment.
oh good point - I'll remove the comments about them just being dummy params.
There was a problem hiding this comment.
I hope we can eliminate the need for these calls in concrete classes' init methods.
|
@holdenk Sorry for late response, please see my inline comments. |
|
Test build #49031 has finished for PR 10216 at commit
|
|
Test build #49032 has finished for PR 10216 at commit
|
|
Test build #49176 has finished for PR 10216 at commit
|
|
Test build #49533 has finished for PR 10216 at commit
|
|
ping @jkbradley if you've got the time to look at this that would be awesome. |
…s and update a new param to also use the new copy syntax instead of duplicated text
|
Test build #50013 has finished for PR 10216 at commit
|
|
@holdenk Sorry for the delay. This looks great, but I was wondering if it could be improved even further: Would it be possible to add a method in the |
|
@jkbradley sounds interesting - would probably want to change the code generated params as well then to match but yah I'll take a crack at that this week. |
|
@holdenk I'm not quite sure what you mean, but I'll comment in a few places to make sure I make my idea clear. |
There was a problem hiding this comment.
This can hopefully remain the same.
|
@jkbradley I mean there is also the shared params code gen, so we would want to update the generated params as well (right now the change only affects the manual params). |
|
Ohh, right! Thanks for remembering. Should I make another pass? |
|
Test build #50122 has finished for PR 10216 at commit
|
|
Just a few small comments now |
…t and setters in tree params codegen, regen shared params
|
@jkbradley thanks addressed those small issues :) |
|
LGTM pending tests. |
|
I hate copy'paste code, so glad to be able to kill some (and not make more of making Python wrappers) :) |
|
Test build #50134 has finished for PR 10216 at commit
|
|
Merging with master Btw, I verified that the doc generates in the same way. |
The current python ml params require cut-and-pasting the param setup and description between the class &
__init__methods. Remove this possible case of errors & simplify use of custom params by adding a_copy_new_parentmethod to param so as to avoid cut and pasting (and cut and pasting at different indentation levels urgh).